

## A 50-Gbit/s 1:4 Demultiplexer IC in InP-based HEMT Technology

H. Kano, T. Suzuki, S. Yamaura, Y. Nakasha, K. Sawada, T. Takahashi, K. Makiyama, T. Hirose, and Y. Watanabe

Fujitsu Laboratories, Ltd., 10-1 Morinosato-Wakamiya, Atsugi 243-0197, Japan



**Abstract**: We have developed a 50-Gbit/s 1:4 demultiplexer (DEMUX) integrated circuit with a wide phase margin of 108 degrees in 0.13-$\mu$m InP-based HEMT technology. To increase the phase margin, we designed the data and clock distribution for high symmetry and freedom from multiple reflections. The measured performance of the fabricated 1:4 DEMUX is suitable for practical use in 50-Gbit/s-class applications.

### I. INTRODUCTION

The rapid growth of the Internet has created strong demand for much greater transmission capacity in backbone optical networks. Time-division multiplexing (TDM) transmission systems operating above 40 Gbit/s are now being developed, and the demultiplexer (DEMUX) will be a key digital circuit in such systems. Several DEMUXs for these systems have been reported [1], [2]. However, there have been few papers on DEMUXs operating above 50 Gbit/s, and no reports on a technique for expanding the phase margin for practical use. We therefore optimized the design of a 1:4 DEMUX capable of operating at 50 Gbit/s to expand its phase margin, and we have demonstrated the effectiveness of this design.

### II. CIRCUIT DESIGN

Our half-rate-clock 1:4 DEMUX IC is based on a tree structure of 1:2 DEMUX cells. The IC consists of two single-ended input buffers for the 50-Gbit/s data and the 25-GHz clock, three 1:2 DEMUX cells, a toggle flip-flop (T-FF), and five differential output buffers (Fig. 1). Each 1:2 DEMUX cell consists of a master-slave delayed flip-flop (MS D-FF) and a master-slave-master delayed flip-flop (MSM D-FF). A selector (SEL) switches the monitor output between a 12.5-GHz clock synchronized with the 1:4 demultiplexed data and the 1:2 demultiplexed data. These circuits are based on source-coupled FET logic (SCFL). To obtain a good transition time, peaking inductors are used with the load resistors in each source-coupled differential circuit. Speed-up capacitors are placed in the source followers to compensate for the losses of the level-shift diodes.
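The tree structure can be illustrated with a purely behavioral sketch. It models ideal sampling only, with no timing, skew, or analog effects. The function names and the mapping of input bits to output channels are assumptions made for illustration; the paper does not specify the channel ordering.

```python
from typing import List, Sequence, Tuple


def demux_1to2(bits: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Split a serial stream into two half-rate streams.

    Behavioral stand-in for one half-rate-clock 1:2 DEMUX cell: one
    flip-flop captures the even-indexed bits and the other, clocked on
    the inverted phase, captures the odd-indexed bits. Which flip-flop
    takes which phase is an assumption of this sketch.
    """
    return list(bits[0::2]), list(bits[1::2])


def demux_1to4(bits: Sequence[int]) -> List[List[int]]:
    """Tree of three 1:2 cells: one at the input rate, two at half rate."""
    branch_a, branch_b = demux_1to2(bits)   # first stage (fastest cell)
    ch0, ch2 = demux_1to2(branch_a)         # second stage, branch A
    ch1, ch3 = demux_1to2(branch_b)         # second stage, branch B
    return [ch0, ch1, ch2, ch3]


if __name__ == "__main__":
    # Label each input bit with its position to make the routing visible.
    for index, channel in enumerate(demux_1to4(list(range(8)))):
        print(f"channel {index}: {channel}")
```

With this ordering, each output channel carries every fourth input bit, so a 50-Gbit/s input yields four 12.5-Gbit/s outputs.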

The 1:2 DEMUX cell at the first stage of the tree structure operates at the maximum speed and is the most critical part of the IC. Having improved our previously reported D-FF [3] to operate above 50 Gbit/s, we focused on the data and clock distribution to ensure that neither data skew nor clock skew would reduce the IC's phase margin. Figure 2 shows the relationship between each skew and the phase margin of the 1:2 DEMUX cell when driven by a half-rate clock. The clock input to the MS D-FF is inverted relative to the clock input to the MSM D-FF so that the two D-FFs alternately latch data at the falling edge of their respective clock inputs. Figures 2(a) and (b) show timing charts without and with skews, respectively. If the two data signals distributed to the MSM D-FF and the MS D-FF have skew $T_{ds}$, the phase margin decreases from $T_{pm}$ to $T_{pm}' = T_{pm} - T_{ds}$. Moreover, if the two clock inputs to the D-FFs have skew $T_{cs}$, the phase margin decreases further to $T_{pm}' - T_{cs} = T_{pm} - T_{ds} - T_{cs}$.
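The skew budget described above can be written as a short calculation. This is a minimal sketch: the function names are hypothetical, the skew values in the usage example are illustrative rather than measured, and the conversion from degrees to time assumes that 360 degrees corresponds to one bit period of the input data, which the paper does not state explicitly.

```python
def bit_period_ps(bit_rate_gbps: float) -> float:
    """Bit period in picoseconds for a bit rate given in Gbit/s."""
    return 1000.0 / bit_rate_gbps


def degrees_to_ps(phase_deg: float, bit_rate_gbps: float) -> float:
    """Convert a phase margin in degrees to time.

    Assumption: 360 degrees equals one bit period of the input data.
    """
    return phase_deg / 360.0 * bit_period_ps(bit_rate_gbps)


def remaining_phase_margin(t_pm: float, t_ds: float, t_cs: float) -> float:
    """Phase margin left after data skew t_ds and clock skew t_cs.

    Implements T_pm' = T_pm - T_ds, then T_pm' - T_cs. All arguments
    must share the same time unit.
    """
    t_pm_prime = t_pm - t_ds
    return t_pm_prime - t_cs


if __name__ == "__main__":
    margin = degrees_to_ps(108.0, 50.0)  # 108 degrees at 50 Gbit/s
    print(f"phase margin: {margin:.2f} ps")
    # Illustrative skews only; these are not values from the paper.
    print(f"after skews: {remaining_phase_margin(margin, 1.0, 0.5):.2f} ps")
```

Under the stated assumption, a bit period at 50 Gbit/s is 20 ps, so every picosecond of skew removes 18 degrees of phase margin.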

These skews occur for two reasons. One is asymmetry in the layout, for example, different lengths of two lines carrying the same signal. The other is waveform degradation due to multiple reflections. We therefore designed the 1:2 DEMUX cell to achieve a symmetrical layout and to eliminate waveform degradation. We explain below how the data and clock signals are distributed.



Fig. 1. Block diagram of the 1:4 DEMUX.



Fig. 2. Timing charts of the 1:2 DEMUX cell (a) without skew and (b) with skew.

#### A. Dividing Data

If a signal line from an input buffer to the two D-FFs is simply divided, the total length of the divided lines will be about 400 $\mu\text{m}$ (Fig. 3(a)), because large passive elements such as peaking inductors and speed-up capacitors occupy space in the D-FFs and keep their inputs apart. A simulated waveform at the end of the divided line is shown in Fig. 3(a). Jitter due to multiple reflections was observed during the transition, because there are impedance mismatches at the connected gates and the total line length exceeds one-tenth of the electrical wavelength, which is 3.2 mm at 50 GHz in the microstrip line. The $\lambda/10$ criterion was chosen so that the peak-to-peak jitter at the cross point is less than 1 ps in the simulation results for inter-cell connection (Fig. 4).

To eliminate this degradation, we designed a divider that provides symmetry between the divided lines and shortens them. It consists of a differential pair and two source followers (Fig. 3(b)). The data path is divided at the output of the differential pair, and each divided line is connected to a source follower. Since the source followers can be placed close to each other, the divided data lines can be made shorter. The line length from the differential pair to each source follower is 50 $\mu\text{m}$, and the length from the source follower to the D-FF is 80 $\mu\text{m}$. The simulated waveform shows little degradation due to multiple reflections, and the peak-to-peak jitter is reduced from 1.5 ps to 0.9 ps (Fig. 3(b)).
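The one-tenth-wavelength criterion can be checked against the line lengths quoted in this subsection. The sketch below uses only the figures given in the text (3.2 mm wavelength, 400 $\mu\text{m}$, 50 $\mu\text{m}$, and 80 $\mu\text{m}$); the function and constant names are hypothetical.

```python
# Electrical wavelength in the microstrip line at 50 GHz, as quoted in the text.
WAVELENGTH_UM = 3200.0


def lambda_over_10_um(wavelength_um: float = WAVELENGTH_UM) -> float:
    """Return the one-tenth-wavelength length limit in micrometres."""
    return wavelength_um / 10.0


def exceeds_limit(length_um: float, wavelength_um: float = WAVELENGTH_UM) -> bool:
    """True if a line is longer than one-tenth of the electrical wavelength."""
    return length_um > lambda_over_10_um(wavelength_um)


if __name__ == "__main__":
    lines = [
        ("simply divided line", 400.0),
        ("differential pair to source follower", 50.0),
        ("source follower to D-FF", 80.0),
    ]
    for name, length_um in lines:
        verdict = "exceeds" if exceeds_limit(length_um) else "is within"
        print(f"{name}: {length_um:.0f} um {verdict} the lambda/10 limit")
```

The limit works out to 320 $\mu\text{m}$, so the simply divided line violates it while both segments of the proposed divider stay well inside it.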



Fig. 3. Simulated eye diagrams (a) with a divided line and (b) with the proposed divider. (Dif: differential pair, SF: source follower)



Fig. 4. Simulated data jitter due to reflection.

#### B. Clock Distribution

For the data signal, waveform degradation due to reflection depends on the data pattern and therefore generates jitter. In contrast, the clock signal is a single repeating pattern of alternating ones and zeros, so its pattern-dependent jitter is smaller than that of the data signal and is, in fact, not critical. The skew between the divided lines is more critical. If the divided lines are long and asymmetrical, differently degraded waveforms arrive at the following inputs, and skew occurs. Because it is difficult to maintain layout symmetry in a digital circuit, the divided lines must be shortened so that the signals keep the same waveform. However, the divider described above is not suited to clock distribution in the 1:2 DEMUX cell, because its layout is so large that other signal lines, such as the divided data line, would become longer.

We therefore propose a clock tree in which one main clock line is connected to the five D-latch inputs by short lines (Fig. 5(a)). In addition, the D-FF layout places the clock and data inputs orthogonally (Fig. 5(b)), so that the transistors for the clock input sit close together and the divided clock line is as short as possible. This distribution reduces the clock skew between the inputs of the first D-latches that receive data signals in the MSM D-FF and the MS D-FF.

### III. FABRICATION

We used 0.13-$\mu$m InP-based HEMTs [4] to fabricate the 1:4 DEMUX. The transconductance $g_m$ was 1034 mS/mm, the cutoff frequency $f_T$ was 173 GHz, and the threshold voltage $V_{th}$ was -0.67 V. Three Au interconnect layers and benzocyclobutene (BCB) interlayer films were used to enable high-speed signal transmission. The first Au layer mainly supplied power, while the second layer served as the ground. This made it easy to build a microstrip line (MSL) for inter-cell connection, with the third layer running above the second-layer ground. The characteristic impedance of the MSL was as low as 52 $\Omega$, which gave better matching to the output impedances of the source followers. Metal-insulator-metal (MIM) capacitors were formed between the second and third layers. NiCr resistors were formed on the InP substrate. The chip size was 4.0 mm $\times$ 3.1 mm, and the chip contained 972 transistors, 446 resistors, and 113 capacitors. A micrograph of the chip is shown in Fig. 6.

### IV. EXPERIMENTAL RESULTS

Figure 7(a) shows eye diagrams obtained during operation at 50 Gbit/s for a $2^{31}-1$ pseudorandom binary sequence (PRBS) input. Clear eye openings were obtained for all output channels with a voltage swing of 750 mV. The demultiplexing operation of the IC was confirmed up to 50 Gbit/s, where the phase margin was 108 degrees. This demonstrates that the proposed design is effective. Figure 7(b) shows the waveforms of a function test at 50 Gbit/s, in which we verified the function of the IC using a data-sequence input. The power supply voltage was -5.2 V and the power consumption was 4.7 W.
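As a rough reading of these figures, the supply current and the energy per input bit follow directly from the reported power, supply voltage, and bit rate. The symbols $I$, $P$, $V$, $R$, and $E_{\text{bit}}$ are introduced here for illustration only, and the current estimate assumes that the single -5.2-V supply delivers all of the reported power.

$$
I \approx \frac{P}{|V|} = \frac{4.7\ \text{W}}{5.2\ \text{V}} \approx 0.90\ \text{A},
\qquad
E_{\text{bit}} = \frac{P}{R} = \frac{4.7\ \text{W}}{50\ \text{Gbit/s}} = 94\ \text{pJ/bit}.
$$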



Fig. 5. Layout of clock distribution (a) in the 1:2 DEMUX cell, and (b) at the inputs of the first D-latches (the part enclosed with an oval in (a)).



Fig. 6. Micrograph of the 1:4 DEMUX.

### V. CONCLUSION

We have fabricated and tested a 50-Gbit/s 1:4 DEMUX IC using 0.13-$\mu$m InP-based HEMTs. The key design element is the data and clock distribution in the first-stage 1:2 DEMUX cell, which operates at the highest bit rate. The IC operated successfully at 50 Gbit/s with a wide phase margin of 108 degrees. These results indicate that the device is suitable for use in 50-Gbit/s-class applications.

### ACKNOWLEDGEMENT

We thank M. Mizoguchi for design support, H. Ito for measurement assistance, M. Nishi for device processing, and H. Shigematsu for helpful discussions.

### REFERENCES

- [1] A. Felder et al., "46 Gb/s DEMUX, 50 Gb/s MUX, and 30 GHz Static Frequency Divider in Silicon Bipolar Technology," IEEE Journal of Solid-State Circuits, Vol. 31, No. 4, April 1996.
- [2] K. Sano et al., "50-Gbit/s Demultiplexer IC Module Using InAlAs/InGaAs/InP HEMTs," IEICE Trans. Electron., Vol. E83-C, No. 11, November 2000.
- [3] T. Suzuki et al., "40-Gbit/s D-type Flip-Flop and Multiplexer Circuits Using InP HEMT," 2001 IEEE MTT-S Digest.
- [4] T. Takahashi et al., "Stable and Uniform InAlAs/InGaAs HEMT ICs for 40-Gbit/s Optical Communication Systems," Proceedings of the 2001 IPRM, pp. 614-617.



Fig. 7. Experimental results at 50 Gbit/s: (a) output eye diagrams, and (b) sections of the output pulse sequence.